Skip to content

Conversation

TheBlueMatt
Copy link
Collaborator

We added the ability to block `ChannelMonitorUpdate`s on receipt of
an RAA in order to avoid dropping a payment preimage from a channel
that created a `PaymentSent` event in
9ede794e8e8559f1b2b386c1c57372094fc92fd4. We did not at the time
use the same infrastructure for `PaymentClaimed` events, but really
should have. While a `PaymentClaimed` event may seem a bit less
critical than a `PaymentSent` event (it doesn't contain a payment
preimage that the user needs to make sure they store for proof of
payment), its still important for users to ensure their payment
tracking logic is always correct.

Here we take the (relatively straightforward) action of setting a
`EventCompletionAction` to block RAA monitor updates on channels
which created a `PaymentClaimed` event. Note that we only block one
random channel from an MPP paymnet, not all of them, as any single
channel should provide enough information for us to recreate the
`PaymentClaimed` event on restart.

@TheBlueMatt TheBlueMatt added this to the 0.2 milestone Aug 4, 2025
@ldk-reviews-bot
Copy link

ldk-reviews-bot commented Aug 4, 2025

👋 I see @tankyleo was un-assigned.
If you'd like another reviewer assignment, please click here.

@TheBlueMatt TheBlueMatt mentioned this pull request Aug 4, 2025
24 tasks
Copy link

codecov bot commented Aug 4, 2025

Codecov Report

❌ Patch coverage is 91.03448% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.93%. Comparing base (0a23664) to head (a80c855).
⚠️ Report is 11 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/ln/channelmanager.rs 87.50% 12 Missing ⚠️
lightning/src/ln/chanmon_update_fail_tests.rs 93.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3988      +/-   ##
==========================================
- Coverage   89.01%   88.93%   -0.09%     
==========================================
  Files         174      174              
  Lines      124395   124576     +181     
  Branches   124395   124576     +181     
==========================================
+ Hits       110730   110786      +56     
- Misses      11187    11292     +105     
- Partials     2478     2498      +20     
Flag Coverage Δ
fuzzing 22.18% <27.83%> (-0.04%) ⬇️
tests 88.75% <91.03%> (-0.09%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@TheBlueMatt TheBlueMatt force-pushed the 2025-08-durable-claims branch 2 times, most recently from d0595d8 to 7261fdf Compare August 4, 2025 19:53
@TheBlueMatt TheBlueMatt marked this pull request as draft August 4, 2025 20:04
@TheBlueMatt
Copy link
Collaborator Author

CI is unhappy. This should be fine but gotta get the stupid multithreaded test to pass.

@TheBlueMatt TheBlueMatt force-pushed the 2025-08-durable-claims branch from 7261fdf to 4d12016 Compare August 4, 2025 21:50

let event_node: &'static TestChannelManager<'static, 'static> =
unsafe { std::mem::transmute(nodes[8].node as &TestChannelManager) };
let thrd_event = std::thread::spawn(move || {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize I also punted on this recently, but maybe we should take the opportunity to revisit using std::thread::scoped now that we can? If we decide against it, maybe we should drop the TODO comment above and elsewhere?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, yea, let me do that in a followup.

@TheBlueMatt TheBlueMatt marked this pull request as ready for review August 5, 2025 11:25
@TheBlueMatt
Copy link
Collaborator Author

Labeled for backport, but note that the first two commits should not be backported, so the second two might need some tweaks.

Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, it's nice we had the infra in place already to make this relatively straightforward.

@valentinewallace
Copy link
Contributor

Needs rebase

Comment on lines +1096 to +1104
let durable_preimage_channel = payment.htlcs.last().map_or(None, |htlc| {
if let Some(node_id) = htlc.prev_hop.counterparty_node_id {
Some((htlc.prev_hop.outpoint, node_id, htlc.prev_hop.channel_id))
} else {
None
}
});
Copy link
Contributor

@valentinewallace valentinewallace Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: don't think it matters but this makes more sense in my brain for not having to think about whether a counterparty node id may not be set for all htlcs:

diff --git a/lightning/src/ln/channelmanager.rs b/lightning/src/ln/channelmanager.rs
index 4fe74b3e7..cb4dc94c1 100644
--- a/lightning/src/ln/channelmanager.rs
+++ b/lightning/src/ln/channelmanager.rs
@@ -1093,12 +1093,10 @@ impl ClaimablePayments {
                                        .or_insert_with(|| {
                                                let htlcs = payment.htlcs.iter().map(events::ClaimedHTLC::from).collect();
                                                let sender_intended_value = payment.htlcs.first().map(|htlc| htlc.total_msat);
-                                               let durable_preimage_channel = payment.htlcs.last().map_or(None, |htlc| {
-                                                       if let Some(node_id) = htlc.prev_hop.counterparty_node_id {
-                                                               Some((htlc.prev_hop.outpoint, node_id, htlc.prev_hop.channel_id))
-                                                       } else {
-                                                               None
-                                                       }
+                                               let durable_preimage_channel = payment.htlcs.iter().find_map(|htlc| {
+                                                       if let Some(id) = htlc.prev_hop.counterparty_node_id {
+                                                               Some((htlc.prev_hop.outpoint, id, htlc.prev_hop.channel_id))
+                                                       } else { None }
                                                });
                                                debug_assert!(durable_preimage_channel.is_some());
                                                ClaimingPayment {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're definitely always set. I was gonna even make it required but I think there's cases where we may fail to deserialize an ancient ChannelMonitor that's just hanging around uselessly if we do :/. Using the last entry here was a almost-stupid micro-optimization, blocking the last channel which is getting an HTLC claimed marginally reduces the chance that we actually end up really blocking, as its gonna be the last one to get an update.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the last entry here was a almost-stupid micro-optimization, blocking the last channel which is getting an HTLC claimed marginally reduces the chance that we actually end up really blocking, as its gonna be the last one to get an update.

Let's add this as a comment?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its so marginal I didn't want to bother lol. Also anyone doing high-latency persist should be doing async updates, where it really won't matter, at least after #3986.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is pretty unclear why we're doing it that way when reading the code at the moment, though.

@TheBlueMatt TheBlueMatt force-pushed the 2025-08-durable-claims branch from 4d12016 to 3887730 Compare August 6, 2025 00:43
@TheBlueMatt
Copy link
Collaborator Author

Rebased.

Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, feel free to squash

In 0.1 we started requiring `counterparty_node_id` to be filled in
in various previous-hop datastructures when claiming HTLCs. While
we can't switch `HTLCSource`'s
`HTLCPreviousHopData::counterparty_node_id` to required (as it
might cause us to fail to read old `ChannelMonitor`s which still
hold `HTLCSource`s we no longer need to claim), we can at least
start requiring the field in `PendingAddHTLCInfo` and
`HTLCClaimSource`. This simplifies `claim_mpp_part` marginally.
When we claim a payment, `Event::PaymentClaimed` contains a list of
the HTLCs we claimed from as `ClaimedHTLC` objects. While they
include a `channel_id` the pyment came to us over, in theory
`channel_id`s aren't guaranteed to be unique (though in practice
they are in all opened channels aside from 0conf ones with a
malicious counterparty). Further, our APIs often require passing
both the `counterparty_node_id` and the `channel_id` to do things
to chanels.

Thus, here we add the missing `counterparty_node-id` to
`ClaimedHTLC`.
Historically we indexed channels by
`(counterparty_node_id, funding outpoint)` in several pipelines,
especially the `ChannelMonitorUpdate` pipeline. This ended up
complexifying quite a few things as we always needed to store the
full `(counterparty_node_id, funding outpoint, channel_id)` tuple
to ensure we can always access a channel no matter its state.

Over time we want to move to only the
`(counterparty_node_id, channel_id)` tuple as *the* channel index,
especially as we move towards V2 channels that have a
globally-unique `channel_id` anyway.

Here we take one small step towards this, avoiding using the
channel funding outpoint in the `EventCompletionAction` pipeline.
@TheBlueMatt
Copy link
Collaborator Author

Squashed without further changes.

@TheBlueMatt TheBlueMatt force-pushed the 2025-08-durable-claims branch from 3887730 to c1e86fc Compare August 6, 2025 19:36
@tankyleo tankyleo self-requested a review August 6, 2025 21:05
impl_writeable_tlv_based_enum!(EventCompletionAction,
(0, ReleaseRAAChannelMonitorUpdate) => {
(0, channel_funding_outpoint, required),
(0, channel_funding_outpoint, option),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other place I found that could be affected by funding outpoints changing due to splicing is BackgroundEvent::MonitorUpdateRegeneratedOnStartup. We may also want to revisit the HTLC forwarding pipeline to make sure we're not relying on them there either.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, it looks like we should drop that. It should be okay, though, I think - funding_txo there only goes in to handle_new_monitor_update which then only uses it to pass back to MonitorUpdateRegeneratedOnStartup if we are running during startup...should be an easy cleanup as a followup.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grr, it is needed in the one spot where we need to add an entry to in_flight_monitor_updates and we need the OutPoint to support downgrades to 0.1.

Comment on lines +1096 to +1104
let durable_preimage_channel = payment.htlcs.last().map_or(None, |htlc| {
if let Some(node_id) = htlc.prev_hop.counterparty_node_id {
Some((htlc.prev_hop.outpoint, node_id, htlc.prev_hop.channel_id))
} else {
None
}
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the last entry here was a almost-stupid micro-optimization, blocking the last channel which is getting an HTLC claimed marginally reduces the chance that we actually end up really blocking, as its gonna be the last one to get an update.

Let's add this as a comment?

@tankyleo tankyleo removed their request for review August 7, 2025 00:23
We added the ability to block `ChannelMonitorUpdate`s on receipt of
an RAA in order to avoid dropping a payment preimage from a channel
that created a `PaymentSent` event in
9ede794. We did not at the time
use the same infrastructure for `PaymentClaimed` events, but really
should have. While a `PaymentClaimed` event may seem a bit less
critical than a `PaymentSent` event (it doesn't contain a payment
preimage that the user needs to make sure they store for proof of
payment), its still important for users to ensure their payment
tracking logic is always correct.

Here we take the (relatively straightforward) action of setting a
`EventCompletionAction` to block RAA monitor updates on channels
which created a `PaymentClaimed` event. Note that we only block one
random channel from an MPP paymnet, not all of them, as any single
channel should provide enough information for us to recreate the
`PaymentClaimed` event on restart.
@TheBlueMatt
Copy link
Collaborator Author

Added a comment about channel selection:

$ git diff-tree -U1 c1e86fcc13 a80c855b21
diff --git a/lightning/src/ln/channelmanager.rs b/lightning/src/ln/channelmanager.rs
index b4eb1f39aa..1fd99f8945 100644
--- a/lightning/src/ln/channelmanager.rs
+++ b/lightning/src/ln/channelmanager.rs
@@ -1095,2 +1095,4 @@ impl ClaimablePayments {
 						let sender_intended_value = payment.htlcs.first().map(|htlc| htlc.total_msat);
+						// Pick an "arbitrary" channel to block RAAs on until the `PaymentSent`
+						// event is processed, specifically the last channel to get claimed.
 						let durable_preimage_channel = payment.htlcs.last().map_or(None, |htlc| {

@wpaulino wpaulino merged commit ed1f304 into lightningdevkit:main Aug 7, 2025
23 of 25 checks passed
@TheBlueMatt
Copy link
Collaborator Author

Backported in #4143

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants